Abstract

We consider the problem of exact identification for read-once functions over arbitrary Boolean bases. We introduce a new type of queries (subcube identity queries), discuss its connection to previously known query types, and study the complexity of the problem in question. Besides these new queries, learning algorithms are allowed to use classic membership queries. We present a technique of modeling an equivalence query with a polynomial number of membership and subcube identity queries, thus establishing (under certain conditions) a polynomial upper bound on the complexity of the problem. We show that in some circumstances, though, equivalence queries cannot be modeled with a polynomial number of subcube identity and membership queries. We construct an example of an infinite Boolean basis with an exponential lower bound on the number of membership and subcube identity queries required for exact identification. We prove that for any finite subset of this basis, the problem remains polynomial.
1 Introduction

Imagine a black box with an unknown Boolean function $f$ of variables $X = \{x_1, \dots, x_n\}$ hidden inside. Suppose that one has an opportunity to obtain correct answers to questions of two types:

(i) if all of the variables from $X$ are assigned specific values, i.e., $x_j = a_j$ for all $x_j \in X$, what value does $f$ have?

(ii) if some of the variables from $X$ are assigned specific values, i.e., $x_j = a_j$ for $x_j \in X' \subseteq X$, is the value of $f$ determined unambiguously?

How many questions does one have to ask in order to identify the function in the box exactly? Clearly, if there is no prior knowledge of $f$, one cannot do better than ask $2^n$ questions in the worst case. Indeed, at the beginning the set of all possibilities consists of $2^{2^n}$ functions. Each question's answer is a single bit, so the height of a (binary) deterministic decision tree representing one's strategy cannot be less than $\log_2 2^{2^n} = 2^n$. However, if one knows a priori that $f$ belongs to a certain class $C$, the problem can become easier. A counting argument here gives the lower bound of $\log_2 |C|$.

In this paper, we consider classes $C$ of Boolean functions which are read-once over various bases $\mathcal{B}$ (formal definitions are given in section 2). While questions of type (i) (membership queries) are fairly common for various learning problems (several settings for read-once functions are discussed in section 3), questions of type (ii) (subcube identity queries) appear to have never been considered by researchers. In section 4 we introduce this new type of queries formally, define our learning model in detail and study the complexity of the considered problem (one of exact identification).

Subsection 4.1 is devoted to definitions and problem setting. In subsection 4.2 we discuss a connection between subcube identity queries in our learning model and two of Valiant's classic query types, necessity and possibility queries. We show that any algorithm using membership and subcube identity queries can be transformed into an algorithm using necessity and possibility queries, and vice versa. We also discuss the possibility of using polynomial modeling techniques for another classic type of queries, namely, Angluin's equivalence queries. We demonstrate that subcube identity queries cannot be modeled with a polynomial number of equivalence queries. In subsection 4.3 we use membership and subcube identity queries to simulate an equivalence query with a polynomial overhead. We show that the considered problem of exact identification for read-once functions over finite Boolean bases from a wide class can be solved with a polynomial number of questions of types (i) and (ii). We also demonstrate that if a related problem of checking read-once functions can be solved polynomially for a finite basis, then so can the considered problem of exact identification with membership and subcube identity queries.

In subsection 4.4 we compare Angluin's learning model, which uses membership and equivalence queries, with our model, which uses membership and subcube identity queries. We provide an example of an infinite Boolean basis and show that an equivalence query in Angluin's model for this basis can be "exponentially more powerful" than membership and subcube identity queries. More formally, we show that a problem of identifying exactly an unknown function from a certain set can be solved with a single equivalence query, but requires an exponentially large number of membership and subcube identity queries. This means that an equivalence query, contrary to the results of the previous subsection, cannot generally be modeled with a polynomial number of membership and subcube identity queries. This fact also gives an example of an infinite Boolean basis such that read-once functions over this basis cannot be identified exactly with a polynomial number of queries in our model. In subsection 4.5 we prove that for any finite subset of this basis, this property does not hold and polynomial algorithms for exact identification of read-once functions exist.

2 Preliminaries 

2.1 Basic definitions 

Suppose $\mathcal{B}$ is a set of Boolean functions. We shall call $\mathcal{B}$ a basis and use its functions to construct formulae. A formula $\mathcal{F}$ over $\mathcal{B}$ is read-once if every variable in $\mathcal{F}$ appears exactly once. A Boolean function is said to be read-once over $\mathcal{B}$ if it can be expressed with a read-once formula over $\mathcal{B}$.

Read-once functions over $\{\wedge, \vee, \neg\}$ are commonly called "read-once" without specifying a basis. Similarly, read-once functions over $\{\wedge, \vee\}$ are widely known as "monotone read-once". In this paper, though, we shall not use either of these terms.

Suppose $f$ is a Boolean function of variables $X = \{x_1, \dots, x_n\}$. A variable $x_i \in X$ is essential for $f$ iff there exist two vectors $a$ and $b$ differing only in the $i$th component such that $f(a) \neq f(b)$. All variables that are not essential are called fictitious.

A partial assignment $p$ to variables $X$ is a mapping from $X$ to $\{0, 1, *\}$. We call an assignment total iff it takes all variables from $X$ to $\{0, 1\}$ (such an assignment is usually identified with a bit vector of length $|X|$). A total extension of a partial assignment $p$ is any total assignment $a$ such that $p$ and $a$ disagree only on variables from $p^{-1}(*)$.

Let $f$ be a Boolean function and $p$ a partial assignment to its variables. Denote by $f_p$ a projection function obtained by "hardwiring" the values assigned by $p$ to the corresponding inputs (whenever $p$ takes $x_i$ to $*$, the corresponding input is left untouched). In other words, $f_p$ is a function of variables $X' = p^{-1}(*)$, its domain comprising exactly $2^{|X'|}$ Boolean vectors of length $|X'|$. The value of $f_p$ on an input vector $y$ is equal to $f(x)$, where $x$ is obtained by extending $y$ with values from $p$. We say that a projection $f_p$ is induced by an assignment $p$.
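The definitions of this subsection can be illustrated with a minimal brute-force sketch. It assumes that a Boolean function is given as a Python callable on bit tuples and that a partial assignment is a tuple over 0, 1 and `None` (standing for $*$); all helper names are hypothetical and chosen for this illustration only.

```python
from itertools import product

STAR = None  # plays the role of the symbol * in a partial assignment


def total_extensions(p):
    """Yield every total assignment that agrees with p on its fixed variables."""
    free = [i for i, v in enumerate(p) if v is STAR]
    for bits in product((0, 1), repeat=len(free)):
        a = list(p)
        for i, b in zip(free, bits):
            a[i] = b
        yield tuple(a)


def projection(f, p):
    """Return the projection f_p (a function of the free variables) and its arity."""
    free = [i for i, v in enumerate(p) if v is STAR]

    def f_p(y):
        a = list(p)
        for i, b in zip(free, y):
            a[i] = b
        return f(tuple(a))

    return f_p, len(free)


def essential_variables(f, n):
    """Indices i such that flipping x_i changes the value of f on some input."""
    essential = set()
    for a in product((0, 1), repeat=n):
        for i in range(n):
            b = a[:i] + (1 - a[i],) + a[i + 1:]
            if f(a) != f(b):
                essential.add(i)
    return essential


# Example: (x0 AND x1) OR x2, viewed as a function of four variables (x3 is fictitious).
f = lambda x: (x[0] & x[1]) | x[2]
p = (1, STAR, 0, STAR)          # hardwire x0 = 1 and x2 = 0
f_p, arity = projection(f, p)   # f_p depends on (x1, x3) and equals x1
```

The enumeration is exponential in the number of free variables, so this is only suitable for checking small examples by hand.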

2.2 The problem of exact identification 

Consider the problem of learning described by Valiant [1] and Angluin [2]. The goal of learning 
is exact identification: given a black box with an unknown object from a known class C, one 
aims to determine which object is hidden in the box. Knowing a priori that the object belongs 
to the class C, one can use queries to bring out its properties and, ultimately, to identify it 
exactly. Queries are answered by honest and accurate oracles. 

In this paper, we are not interested in time complexity, but focus our attention on the number of queries performed by algorithms in the worst case. The algorithms, therefore, can be represented as deterministic decision trees. Nevertheless, one can easily check that all the algorithms, when represented as Turing machines, run in time polynomial in $n$ (all our objects are Boolean functions of variables $X = \{x_1, \dots, x_n\}$ as described below, so $n$ is the number of variables).

3 Learning read-once functions 

We consider a problem of learning (identifying exactly, see subsection 2.2) read-once Boolean functions. This problem has been studied since paper [1]. In the setting being considered, a basis $\mathcal{B}$ and a set of variables $X = \{x_1, \dots, x_n\}$ are known a priori. The corresponding class $C$ of objects being learnt is a set of Boolean functions (all functions of variables $X$ which are read-once over $\mathcal{B}$), so the object in question (the target function $f$) can be regarded as an unknown concept or property. This idea gave names to various types of queries, suggested by Valiant and Angluin.

3.1 Necessity and possibility queries 

Valiant's approach to learning read-once functions [1] suggested using three types of queries. We shall describe only the first one (the second and the third ones are related to the notion of Boolean functions' prime implicants).

A necessity query takes a single partial assignment $p$ as an input. The result of the query is "yes" if $f_p \equiv 1$, otherwise the result is "no". Valiant also defined a possibility query, which is dual to a necessity one. It also takes a single partial assignment $p$ and returns "yes" iff $f_p \not\equiv 0$.
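A minimal sketch of the two oracles, assuming the target is available as a Python callable and a partial assignment is a tuple over 0, 1 and `None` (for $*$). The function names are hypothetical; a real oracle would of course not expose the target function.

```python
from itertools import product


def extensions(p):
    """Yield all total extensions of the partial assignment p."""
    free = [i for i, v in enumerate(p) if v is None]
    for bits in product((0, 1), repeat=len(free)):
        a = list(p)
        for i, b in zip(free, bits):
            a[i] = b
        yield tuple(a)


def necessity(f, p):
    """'yes' (True) iff the projection f_p is identically 1."""
    return all(f(a) == 1 for a in extensions(p))


def possibility(f, p):
    """'yes' (True) iff the projection f_p is not identically 0."""
    return any(f(a) == 1 for a in extensions(p))


# Example target: x0 AND (x1 OR x2)
f = lambda x: x[0] & (x[1] | x[2])
```

For instance, fixing $x_0 = 1$ and $x_1 = 1$ makes the example target necessarily true, while fixing $x_0 = 0$ makes it impossible.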

3.2 Membership and equivalence queries 

Angluin's approach to learning [2] introduced other types of queries. We shall describe two 
of them. 

Membership queries allow one to learn the value of the target function $f$ on a given input. Such a query takes an input $x$ (a total assignment to variables $X$) and returns the corresponding value $f(x)$.

Equivalence queries allow one to determine whether the target function can be exactly represented with a given formula. The algorithm presents a formula $S$ representing a Boolean function $g$, and the corresponding oracle determines whether $f$ is equivalent to $g$. It either outputs "yes" or gives a counterexample $y$ such that $f(y) \neq g(y)$. We consider only proper equivalence queries, i.e., ones restricted to functions $g$ from the class $C$ (a more liberal setting could allow the use of an arbitrary Boolean function here).

One of the major early results in the area of learning read-once functions belongs to Angluin, Hellerstein and Karpinski. In paper [3] they describe an algorithm solving the problem for the basis of conjunction, disjunction and negation, using $O(n^3)$ membership and $O(n)$ equivalence queries. Here $n$ is the number of variables, i.e., the cardinality of $X$. In this paper, we shall always measure the number of queries performed by an algorithm as a function of $n$.

An early generalization [3] of Angluin, Hellerstein and Karpinski's result allows the basis $\mathcal{B}$ to contain arbitrary symmetric threshold functions. A threshold function is one satisfying the condition

$f(x_1, \dots, x_n) = 1 \iff a_1 x_1 + \dots + a_n x_n \geq a_0$

for some real numbers $a_0, a_1, \dots, a_n$. If none of $a_1, \dots, a_n$ is negative, then $f$ is clearly monotone; if $a_1 = \dots = a_n$, then $f$ is symmetric. The following theorem belongs to Bshouty, Hancock, Hellerstein and Karpinski [4].

Theorem I. Read-once functions over the basis of arbitrary symmetric threshold functions are exactly identifiable with $O(n^4)$ membership and $O(n)$ equivalence queries.

Further research in this area culminated in the following theorem due to Bshouty, Hancock and Hellerstein [5].

Theorem II. Read-once functions over the basis of arbitrary functions of constant fan-in $l$ are exactly identifiable with $O(n^{l+2})$ membership and $n$ equivalence queries.



3.3 Subcube parity queries 

Paper [6] suggested studying a related problem of learning read-once functions with no fictitious variables using subcube queries. The main goal is exact identification as described in subsection 2.2, but in this case all essential variables of the target function are also considered to be known a priori. In the setting considered in [6], a learning algorithm can use membership queries defined in subsection 3.2 and subcube parity queries, which are defined as follows.

Suppose $f$ is an unknown target function and $X = \{x_1, \dots, x_n\}$ is the set of all its essential variables. A subcube parity query takes a partial assignment $p$ as an input and yields the parity (sum modulo 2) of all values of the induced projection $f_p$ on its $2^{|X'|}$ possible inputs (here $X' = p^{-1}(*)$). The term "subcube parity" is determined by the observation that the values $f(x)$ of the target function are summed over an $|X'|$-dimensional subcube of the Boolean hypercube $\{0, 1\}^{|X|}$. This subcube is restricted by $p$ and consists of all possible inputs for $f_p$. Note that a membership query is a particular case of a subcube parity query (for a total assignment $p$).
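The definition can be sketched by brute force as follows, again assuming the target is a Python callable on bit tuples and `None` stands for $*$. The names `subcube` and `subcube_parity` are hypothetical.

```python
from itertools import product


def subcube(p):
    """Yield all total assignments in the subcube restricted by p."""
    free = [i for i, v in enumerate(p) if v is None]
    for bits in product((0, 1), repeat=len(free)):
        a = list(p)
        for i, b in zip(free, bits):
            a[i] = b
        yield tuple(a)


def subcube_parity(f, p):
    """Sum modulo 2 of the values of f over the subcube restricted by p."""
    return sum(f(a) for a in subcube(p)) % 2


conj = lambda x: x[0] & x[1]   # conjunction of two variables
xor = lambda x: x[0] ^ x[1]    # sum modulo 2 of two variables
```

On a total assignment the subcube is a single point, so the query degenerates into a membership query, as noted in the text.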

A basis $\mathcal{B}$ is called projection closed if any projection of a function from $\mathcal{B}$ also lies in $\mathcal{B}$. We shall call $\mathcal{B}$ complex if it is projection closed and contains conjunction, disjunction and negation functions. For complex bases, the following criterion determining the power of subcube parity queries holds true [6]:

Theorem III. Suppose $\mathcal{B}$ is a complex basis. Then read-once functions of variables $X = \{x_1, \dots, x_n\}$ over $\mathcal{B}$ are exactly identifiable with a polynomial number of subcube parity queries iff all functions from $\mathcal{B}$ are read-once over the basis $\{\wedge, \vee, \neg\}$. If this is the case, there exists an algorithm using $n^2 - n + 1$ queries, otherwise an exponential number of queries is necessary for exact identification.

3.4 The problem of checking 

We also need several results of research in a related area, that of a checking problem. Suppose $C$ is a known class of objects, and one is given a black box with an unknown object from $C$. One is also given a hypothesis that the box contains a certain object $f \in C$. One's task is to check whether this hypothesis is true or false. The class $C$, object $f$ and available queries all depend on a specific setting. Note that the order of queries asked by an algorithm is not important in the checking problem: the task of the algorithm is simply to check whether all the answers are correct. Any such algorithm $A$, therefore, can be represented by a checking test $T_A = \{(q, q(f)) : A \text{ performs query } q\}$, where $q(f)$ is the result of $q$ when addressed to $f$. One can see that $T_A$ is simply a table of input queries and their return values for $f$.

The problem of checking for read-once functions was set up in paper [7]. The considered class of objects consists of all read-once functions of variables $X = \{x_1, \dots, x_n\}$ over an arbitrary basis $\mathcal{B}$, and a target function $f$ is known to depend essentially on all the variables from $X$. The only available queries are membership queries. A checking test is a set $T_A = \{(x, f(x)) : A \text{ asks the value on } x\}$. One may also identify a checking test with the set of inputs contained in it.

This problem has been studied for various bases, both for individual functions and in a "uniform" setting (determining the number of queries sufficient for checking any read-once function of $n$ variables). For arbitrary finite bases of functions of fan-in at most $l$, the following approach was suggested.

Take a target read-once function $f$ of variables $x_1, \dots, x_n$ (as stated above, all the variables are known to be essential). Let $X'$ be a subset of $X = \{x_1, \dots, x_n\}$ of size $l$. Suppose there exists a partial assignment $p$ such that $p(x_i) = * \iff x_i \in X'$ and the projection $f_p$ depends essentially on all variables from $X'$. In this case the set of all total assignments $a$ extending $p$ is called an essentiality hypercube for $f$ satisfying the set of variables $X'$. An $l$-essentiality hypercube set for $f$ is any set $H_f$ containing essentiality hypercubes satisfying every subset $X' \subseteq X$ of size $l$, whenever this is possible. If for a certain subset $X'$ such a hypercube does not exist, no restriction is imposed on $H_f$. If an $l$-essentiality hypercube set for $f$ contains essentiality hypercubes for all $\binom{n}{l}$ subsets of $X$ of size $l$, the target function $f$ is called $l$-satisfiable.

Now denote by $B_l$ the basis of all functions of fan-in at most $l$. The following theorem is proved in [8]:

Theorem IV. Suppose $l$ is an arbitrary natural number, $l \geq 2$. Let $f$ be an $l$-satisfiable read-once function over $B_l$ and $H_f$ its $l$-essentiality hypercube set. Then the values of $f$ on vectors from $H_f$ constitute a checking test for $f$ in the basis $B_l$.

Note that under conditions of the theorem, the cardinality of $H_f$ is at most $\binom{n}{l} \cdot 2^l = O(n^l)$, which is polynomial in $n = |X|$.

Unfortunately, for $l = 3$ and greater, there exist read-once functions over $B_l$ which are not $l$-satisfiable. The key problem here lies in verifying the following conjecture:

Proposition V (hypercube conjecture). Suppose $l$ is an arbitrary natural number, $l \geq 2$. Let $f$ be a read-once function over $\mathcal{B} \subseteq B_l$. Then:

(strong form) for any $l$-essentiality hypercube set $H_f$ for $f$ the values of $f$ on vectors from $H_f$ constitute a checking test for $f$ in the basis $\mathcal{B}$;

(weak form) there exists an $l$-essentiality hypercube set $H_f$ for $f$ such that the values of $f$ on vectors from $H_f$ constitute a checking test for $f$ in the basis $\mathcal{B}$.

The strong form of this conjecture for all $\mathcal{B} \subseteq B_l$ was proved for $l = 2$ in paper [7] (the proof is also presented in the Appendix, since the main techniques in this area have not been available in English yet), for $l = 3, 4$ in paper [8] and for $l = 5$ in paper [9]. It remains open for $l \geq 6$: neither form is proved for these values of $l$. Nevertheless, it is known that the strong form of the conjecture holds true for any finite basis containing no discriminatory functions. A function $f$ of variables $X$ is discriminatory if there exists a non-empty subset $X'$ of $X$ such that all projections $f_a$ for total assignments $a$ to the variables $X'$ have at least one fictitious variable from $X \setminus X'$ (all variables from $X$ are considered essential for $f$). All discriminatory functions have at least 3 essential variables; all read-once functions over bases without discriminatory functions are $l$-satisfiable for any $l$. These results and several others can also be found in [8].
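The definition of a discriminatory function can be checked by brute force on small examples. The sketch below is an illustration under stated assumptions: functions are Python callables on bit tuples, all of whose variables are essential, and the helper names are hypothetical. The two example functions (a three-variable multiplexer and the three-variable majority) are chosen here for illustration and do not come from the text.

```python
from itertools import combinations, product


def restrict(f, fixed):
    """Hardwire the variables listed in the dict fixed (index -> bit) into f."""
    def h(x):
        y = list(x)
        for i, b in fixed.items():
            y[i] = b
        return f(tuple(y))
    return h


def is_essential(h, i, n):
    """True iff flipping variable i changes h on some input."""
    for a in product((0, 1), repeat=n):
        b = a[:i] + (1 - a[i],) + a[i + 1:]
        if h(a) != h(b):
            return True
    return False


def is_discriminatory(f, n):
    """Brute-force test of the definition; assumes all n variables are essential for f."""
    for size in range(1, n):
        for chosen in combinations(range(n), size):
            rest = [i for i in range(n) if i not in chosen]
            # Every total assignment to the chosen variables must leave
            # at least one of the remaining variables fictitious.
            if all(
                any(not is_essential(restrict(f, dict(zip(chosen, bits))), i, n)
                    for i in rest)
                for bits in product((0, 1), repeat=size)
            ):
                return True
    return False


mux = lambda x: x[1] if x[0] else x[2]            # "if x0 then x1 else x2"
maj = lambda x: int(x[0] + x[1] + x[2] >= 2)      # majority of three variables
```

Fixing $x_0$ in the multiplexer always makes one of the other two variables fictitious, so it is discriminatory; the majority function, a symmetric threshold function, is not.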

4 Subcube identity queries 

4.1 Definition and problem setting 

In this paper, we consider the problem of learning read-once functions in the following setting. The aim of learning is exact identification, as described in subsection 2.2. We do not impose any restrictions on the target function, similarly to the settings of subsection 3.2 and contrary to the settings of subsections 3.3 and 3.4. That is, we do not require all its variables to be essential, though we still consider the set $X$ of input variables known a priori. Formally, if one is given a Boolean basis $\mathcal{B}$ and a set of variables $X$, then the class $C$ of objects being learnt is the set of all Boolean functions of variables $X$ which are read-once over $\mathcal{B}$. Available queries are membership queries, as defined in subsection 3.2, and subcube identity queries, which are defined as follows.

An input to a subcube identity query is a partial assignment $p$ to variables from $X$. The corresponding oracle determines the induced projection $f_p$, as described in subsection 3.3, and then checks whether $f_p \equiv b$ for either $b = 0$ or $b = 1$. If so, the oracle outputs "yes", otherwise it outputs "no", but does not give any further information.

Note that the result of a subcube identity query, unlike that of an equivalence one, is always a single bit. Also note that if the input assignment $p$ is total, then the oracle always outputs "yes", so "zero-dimensional" queries (i.e., those providing total assignments) are of no use. In fact, we could even change our definition of the oracle so that it would output $f(p)$ if $p$ is total. This modified definition would then generalize that of the membership oracle, similarly to the subcube parity case.
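A brute-force sketch of the oracle, together with the modified variant mentioned above, under the assumption that the target is a Python callable on bit tuples and `None` stands for $*$. Both function names are hypothetical.

```python
from itertools import product


def subcube(p):
    """Yield all total extensions of the partial assignment p."""
    free = [i for i, v in enumerate(p) if v is None]
    for bits in product((0, 1), repeat=len(free)):
        a = list(p)
        for i, b in zip(free, bits):
            a[i] = b
        yield tuple(a)


def subcube_identity(f, p):
    """'yes' (True) iff the projection f_p is constant (identically 0 or identically 1)."""
    return len({f(a) for a in subcube(p)}) == 1


def subcube_identity_modified(f, p):
    """Variant discussed in the text: on a total assignment return f(p) instead of 'yes'."""
    if None not in p:
        return f(p)
    return subcube_identity(f, p)


# Example target: (x0 AND x1) OR x2
f = lambda x: (x[0] & x[1]) | x[2]
```

The answer "yes" does not reveal which constant the projection equals; one membership query on any total extension is needed to find out.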

Our goal now is to determine the power of subcube identity queries. In subsection 4.2 we demonstrate that subcube identity queries cannot be modeled with a polynomial number of equivalence queries. We also discuss a connection between subcube identity queries and Valiant's necessity and possibility queries. In subsection 4.3 we show that in some circumstances subcube identity queries can serve as a substitute for equivalence queries. In subsection 4.4, though, we provide an example of a Boolean basis such that this property does not hold. A known border between polynomial and exponential complexity of the considered problem is discussed in subsection 4.5.



4.2 Some remarks on modeling 

Note that if membership queries are not available in a learning model, then subcube identity 
queries can turn out significantly more powerful than classic equivalence ones: 

Theorem 1. The problem of exact identification for non-constant read-once functions over the basis $\{\wedge, \vee\}$ can be solved by an algorithm performing $O(n^2)$ subcube identity queries.

Proof. Note that for all non-constant read-once functions $f$ over $\{\wedge, \vee\}$, the value of $f$ on the vector $\mathbf{1} = (1, \dots, 1)$ is 1. One can use an algorithm from [3], which uses $O(n^2)$ membership queries to perform exact identification. Since for monotone Boolean functions $f(a) = 1$ iff $f(a') = 1$ for all vectors $a'$ such that $a \leq a' \leq \mathbf{1}$, a membership query for $a$ can be simulated with a subcube identity query for a partial assignment $p$ such that $p(x_i) = 1$ iff $a(x_i) = a_i = 1$ and $p^{-1}(0) = \emptyset$: the oracle answers "yes" iff $f(a) = 1$. □
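The simulation used in the proof can be sketched as follows. The brute-force oracle and the example target are assumptions made for this illustration (the names are hypothetical); the only step taken from the proof is the construction of $p$ from $a$.

```python
from itertools import product


def make_identity_oracle(f):
    """Brute-force subcube identity oracle for a target f (illustration only)."""
    def oracle(p):
        free = [i for i, v in enumerate(p) if v is None]
        values = set()
        for bits in product((0, 1), repeat=len(free)):
            a = list(p)
            for i, b in zip(free, bits):
                a[i] = b
            values.add(f(tuple(a)))
        return len(values) == 1
    return oracle


def membership_via_identity(identity_oracle, a):
    """Simulate a membership query for a monotone non-constant target with f(1,...,1) = 1."""
    # Keep the ones of a and free every zero: p(x_i) = 1 iff a_i = 1, p never assigns 0.
    p = tuple(1 if bit == 1 else None for bit in a)
    # The subcube contains the all-ones vector, where f is 1, so "yes" means f_p is
    # identically 1; by monotonicity this happens exactly when f(a) = 1.
    return 1 if identity_oracle(p) else 0


# Example: a monotone read-once function over {AND, OR}
f = lambda x: (x[0] | x[1]) & (x[2] | x[3])
oracle = make_identity_oracle(f)
```

Each simulated membership query costs exactly one subcube identity query, which is why the query bound of the underlying algorithm carries over unchanged.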

Angluin, Hellerstein and Karpinski proved [3] that the same problem cannot be solved with any polynomial number of equivalence queries. Combined with the result of the theorem, this means that subcube identity queries cannot be modeled with a polynomial number of equivalence queries. Whether this holds true in the presence of membership queries is an open problem. The possibility of modeling equivalence queries with a polynomial number of membership and subcube identity queries is considered in subsections 4.3 and 4.4.

It must be remarked that subcube identity queries are closely related to Valiant's necessity and possibility queries. More precisely, a subcube identity query for a partial assignment $p$ can be modeled with one necessity and one possibility query for $p$. Indeed, if the necessity query returns "yes", then $f_p \equiv 1$ and the subcube identity query should also return "yes". If the possibility query returns "no", then $f_p \equiv 0$ and the subcube identity query should still return "yes". In all other cases, the subcube identity query should return "no". What's more, both necessity and possibility queries can be modeled with one subcube identity and one membership query. If $p$ is total, then modeling is trivial (no subcube identity queries are needed). In the other case, if a subcube identity query returns "no", then $f_p$ is non-constant, so the necessity query should return "no" and the possibility query should return "yes". Otherwise $f_p$ is constant, and a membership query for an arbitrary total extension of $p$ reveals its value: both queries should return "yes" if this value is 1 and "no" if it is 0.
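The two conversions can be sketched and cross-checked by brute force. The sketch takes the definitions of subsection 3.1 literally (necessity: $f_p \equiv 1$; possibility: $f_p \not\equiv 0$), so a non-constant projection gives necessity "no" and possibility "yes". All names are hypothetical.

```python
from itertools import product


def values_on_subcube(f, p):
    """Set of values taken by f on all total extensions of p (None stands for *)."""
    free = [i for i, v in enumerate(p) if v is None]
    values = set()
    for bits in product((0, 1), repeat=len(free)):
        a = list(p)
        for i, b in zip(free, bits):
            a[i] = b
        values.add(f(tuple(a)))
    return values


def necessity(f, p):
    return values_on_subcube(f, p) == {1}


def possibility(f, p):
    return 1 in values_on_subcube(f, p)


def identity(f, p):
    return len(values_on_subcube(f, p)) == 1


def identity_from_valiant(nec_answer, pos_answer):
    """Subcube identity answer from one necessity and one possibility answer for the same p."""
    return nec_answer or not pos_answer


def valiant_from_identity(identity_answer, member_value):
    """(necessity, possibility) from one identity answer and f on any total extension of p."""
    if not identity_answer:
        return (False, True)        # non-constant: not identically 1, not identically 0
    return (member_value == 1, member_value == 1)


f = lambda x: (x[0] & x[1]) | x[2]  # example target
```

In the second conversion the membership value is only consulted when the identity answer is "yes", matching the one-plus-one query count stated in the text.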



4.3 Modeling equivalence queries in finite bases 

In this subsection, we demonstrate that under certain conditions equivalence queries can be 
simulated with membership and subcube identity ones. We describe the technique of modeling 
in circumstances allowing only polynomial overhead. The key fact is stated in the following 
lemma: 

Lemma 2. Suppose $\mathcal{B}$ is a finite basis for which the hypercube conjecture holds true. Then an equivalence query for a read-once function over $\mathcal{B}$ can be modeled with $O(n^l)$ membership and $O(n^l)$ subcube identity queries, where $l$ is the maximum fan-in of functions from $\mathcal{B}$.

Proof. Suppose $f(x_1, \dots, x_n)$ is a target function and $g$ is a function supplied to the equivalence oracle. The oracle needs to check whether $f \equiv g$ and, if so, output "yes", otherwise give a counterexample $y$ such that $f(y) \neq g(y)$.

Note that $g$ is a read-once function over $\mathcal{B}$. Denote by $g'$ a function obtained from $g$ by eliminating all its fictitious variables. Since the hypercube conjecture holds true for $\mathcal{B}$, one can construct a checking test $T'$ for $g'$ containing $O(n^l)$ pairs of inputs and the corresponding values of $g'$. Take an arbitrary total assignment $a$ for the fictitious variables of $g$ and extend all input vectors from $T'$ with $a$. The obtained set of pairs $(x, g(x))$ constitutes a membership query table $T$ for $g$. We now demonstrate how $T$ can be used to simulate an equivalence query.

To reach the desired goal, we run a membership query for each input vector $x$ contained in pairs from $T$. Denote by $b$ the result of the query. Clearly, $b = f(x)$. If for some $x$ we have $b \neq g(x)$, then we output $x$ and terminate the modeling. Otherwise, since $T'$ is a checking test for $g'$, we conclude that $g' \equiv f_a$, where $f_a$ is the corresponding projection.

Now we must check whether the equality $g(x) = f(x)$ holds for all $x$. For each pair $(x', g'(x'))$ in $T'$, run a subcube identity query for the partial assignment $p$ obtained from $x'$ by assigning $*$ to all variables lacking values. If all such queries give "yes" answers, then $g \equiv f$, so we output "yes". Indeed, since $T'$ is a checking test for $g'$, in this case we know that $g'$ is equivalent to all projections of $f$ induced by partial assignments $a'$ which assign arbitrary constant values to the fictitious variables of $g$. This means that all fictitious variables of $g$ are also fictitious for $f$, so $f \equiv g$. Note that in this case $O(n^l)$ membership and $O(n^l)$ subcube identity queries are used.

Suppose now that a subcube identity query for some $p$ returns "no". In this case $f_p$ is non-constant, whereas $g_p$ is constant, so we can find a total assignment $a$ such that $f$ and $g$ disagree on $a$, and output the corresponding input vector. The procedure performing this task is denoted by $S(p)$ and defined as follows. Let $x_i$ be a variable such that $p(x_i) = *$. Denote by $p_b$ the partial assignment obtained from $p$ by changing the value of $p(x_i)$ to $b$. If such an assignment is total, then we run a membership query for one of the two total extensions of $p$ and determine the input $x$ such that $f(x) \neq g(x)$. Otherwise, we run a subcube identity query for $p_0$. If it returns "no", we forget about $p$ and run $S(p_0)$. If the query returns "yes", we run another subcube identity query for $p_1$. The answer "no" makes us run $S(p_1)$, and the answer "yes" means that the projections $f_{p_0}$ and $f_{p_1}$ are distinct constants, so a single membership query suffices to choose $b$ with $f_{p_b} \not\equiv g_{p_b}$, and any total extension of $p_b$ is then a counterexample. Thus, $S(p)$ always terminates and requires $O(n)$ queries for any $n$-variable functions $f$ and $g$.

Note that without loss of generality, $l \geq 1$, otherwise $\mathcal{B} \subseteq \{0, 1\}$ and an equivalence query can be modeled with a single membership query. Hence, the number of queries of each type is $O(n^l) + O(n) = O(n^l)$, which concludes the proof. □
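The counterexample search $S(p)$ from the proof can be sketched as an iterative procedure. The sketch assumes brute-force oracles built from a known target (for demonstration only), `None` for $*$, and a known constant `c` equal to the value of $g$ on the subcube restricted by $p$. All names are hypothetical.

```python
from itertools import product


def make_oracles(f):
    """Brute-force membership and subcube identity oracles for f, with query counters."""
    counts = {"membership": 0, "identity": 0}

    def member(a):
        counts["membership"] += 1
        return f(a)

    def identity(p):
        counts["identity"] += 1
        free = [i for i, v in enumerate(p) if v is None]
        values = set()
        for bits in product((0, 1), repeat=len(free)):
            a = list(p)
            for i, b in zip(free, bits):
                a[i] = b
            values.add(f(tuple(a)))
        return len(values) == 1

    return member, identity, counts


def fix(p, i, b):
    """Partial assignment p_b: p with the free variable x_i set to b."""
    return p[:i] + (b,) + p[i + 1:]


def some_extension(p):
    """An arbitrary total extension of p (free variables set to 0)."""
    return tuple(0 if v is None else v for v in p)


def find_counterexample(member, identity, p, c):
    """Given that f_p is non-constant and g_p is the constant c, return a with f(a) != c."""
    while True:
        i = p.index(None)                  # some variable with p(x_i) = *
        p0, p1 = fix(p, i, 0), fix(p, i, 1)
        if None not in p0:                 # only one free variable was left
            return p0 if member(p0) != c else p1
        if not identity(p0):               # f is still non-constant on the half x_i = 0
            p = p0
        elif not identity(p1):             # f is still non-constant on the half x_i = 1
            p = p1
        else:                              # both halves constant, with different values
            a0 = some_extension(p0)
            return a0 if member(a0) != c else some_extension(p1)


# Example: g = x0 AND x1 has fictitious variables x2, x3; the target f does not.
f = lambda x: (x[0] & x[1]) | (x[2] & x[3])
member, identity, counts = make_oracles(f)
a = find_counterexample(member, identity, (0, 0, None, None), 0)
```

Each loop iteration spends at most two subcube identity queries and fixes one more variable, and the search ends with a single membership query, which matches the $O(n)$ bound stated in the proof.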

The main result of this subsection is formulated as follows: 



Theorem 3. Suppose $\mathcal{B}$ is a finite basis for which the hypercube conjecture holds true. Then read-once functions over $\mathcal{B}$ are exactly identifiable with $O(n^{l+2})$ membership and $O(n^{l+1})$ subcube identity queries, where $l$ is the maximum fan-in of functions from $\mathcal{B}$.

Proof. Applying Lemma 2 to an algorithm for exact identification using $O(n^{l+2})$ membership and $n$ equivalence queries (see Theorem II) yields a desired algorithm. The number of membership queries is $O(n^{l+2}) + n \cdot O(n^l) = O(n^{l+2})$, the number of subcube identity queries is $n \cdot O(n^l) = O(n^{l+1})$. □

For now, we can say that all the conditions are satisfied in the particular cases described 
in the following corollary. 

Corollary 4. Suppose $\mathcal{B}$ is a finite basis. Also suppose that $\mathcal{B}$ contains either no discriminatory functions or no functions of fan-in 6 and greater. Then read-once functions over $\mathcal{B}$ are exactly identifiable with a polynomial number of membership and subcube identity queries.

4.4 Lower bound for one infinite basis 

In this subsection we consider the basis of arbitrary monotone threshold functions. Our key 
argument refers to learning the functions of the basis themselves. 

Note that for each natural $n \geq 2$ and for every real $s$ the following symmetric function is monotone and threshold:

$f(x_1, \dots, x_n) = 1 \iff x_1 + \dots + x_n \geq s$.

Assume $k = \lfloor n/2 \rfloor$ and $s = k + 1$. Increasing $k$ coefficients by ^^ and setting $s$ to $k + 2$ yields a new monotone threshold function, which disagrees with $f$ on a single input vector containing exactly $k$ ones. Let $C_n$ be the set of all $\binom{n}{k}$ such functions and $f$.

Lemma 5. The problem of exact identification of an unknown function from $C_n$:
(a) can be solved with a single equivalence query, but
(b) cannot be solved with fewer than $\binom{n}{k}$ membership and subcube identity queries.

Proof. The first part is straightforward, because an equivalence query for $f$ solves the problem: the answer is either "yes" or a counterexample, which is the unique input on which the target disagrees with $f$ and therefore identifies the target. We now use an adversary argument to prove the second part. If the queries used are all membership queries, then the desired bound is also straightforward. For subcube identity queries, we observe that the only reasonable ones are those which supply a partial assignment $p$ allowing a unique total extension $a$ with exactly $k$ ones. Indeed, if this is not the case, then $p$ itself either has at least $k + 1$ ones (or at least $n - k + 1$ zeros; in both cases all corresponding projections $f_p$ are constant) or allows two different total extensions with $k - 1$ and $k + 1$ ones, respectively (all corresponding projections are non-constant). Hence, given that the target function is taken from the set $C_n$ defined above, each query can only reveal its value on a single input vector containing $k$ ones. Thus, if fewer than $\binom{n}{k}$ queries have been asked, an imaginary adversary can always conceive of two suitable functions: the first is $f$ and the second disagrees with $f$ on an input vector which has not been queried yet. □
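For small $n$, the structure of such a class can be checked by brute force. The sketch below is an illustration only: the weights it uses (scaled to integers, with the threshold of $f$ left unchanged) are an assumption of this sketch, chosen so that each variant disagrees with $f$ on exactly one input with $k$ ones; they are not claimed to be the coefficients used in the construction above. All names are hypothetical.

```python
from itertools import combinations, product
from math import comb


def threshold(weights, s):
    """Boolean function that is 1 iff the weighted sum of the inputs is at least s."""
    return lambda x: int(sum(w * b for w, b in zip(weights, x)) >= s)


def build_class(n):
    """Return f and one variant per k-subset of variables, where k = floor(n / 2)."""
    k = n // 2
    # f is 1 iff x_1 + ... + x_n >= k + 1; weights and threshold are scaled by k.
    f = threshold([k] * n, k * (k + 1))
    variants = []
    for chosen in combinations(range(n), k):
        # Assumed weights: k + 1 on the chosen variables, k elsewhere, same threshold.
        w = [k + 1 if i in chosen else k for i in range(n)]
        variants.append((chosen, threshold(w, k * (k + 1))))
    return f, variants


def disagreements(f, g, n):
    """All inputs on which f and g take different values."""
    return [x for x in product((0, 1), repeat=n) if f(x) != g(x)]


n = 5
f, variants = build_class(n)
```

With these weights every variant is a monotone threshold function, and a counterexample to an equivalence query for $f$ pins down the chosen subset, which is the situation exploited in part (a) of the lemma.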

This lemma implies that subcube identity queries do not possess the same power as equivalence queries. More precisely, one cannot use modeling techniques to substitute membership and subcube identity queries for equivalence queries with a polynomial overhead only. We also obtain the following statement concerning exact identification of monotone threshold functions:



Theorem 6. Monotone threshold functions of $n$ variables require at least $\binom{n}{\lfloor n/2 \rfloor}$ membership and subcube identity queries for exact identification.

Since every basis function is read-once by definition, we obtain the following lower bound on the number of queries needed for solving our main problem:

Corollary 7. Read-once functions of $n$ variables over the basis of all monotone threshold functions require at least $\binom{n}{\lfloor n/2 \rfloor}$ membership and subcube identity queries for exact identification.

4.5 Polynomial vs. exponential complexity border 

In this subsection we discuss a border between polynomial and exponential complexity for 
our setting. Observe that the following statement holds true: 

Claim 8. No threshold function is discriminatory. 

Proof. Without loss of generality, take a monotone threshold function $g(x_1, \dots, x_n)$. Suppose that

$g(x_1, \dots, x_n) = 1 \iff G(x_1, \dots, x_n) \geq 0$,

where $G(x_1, \dots, x_n) = a_1 x_1 + \dots + a_n x_n - a_0$ for some non-negative real numbers $a_0, a_1, \dots, a_n$. It is sufficient to show that if a variable $x_i$ is fictitious for $g$, then all the variables $x_j$ such that $a_j \leq a_i$ are also fictitious. Indeed, once this fact is proved, one may observe that whenever all the projections $f_a$ of a monotone threshold function $f$ of variables $X$ induced by total assignments to any fixed subset of $X$ have at least one fictitious variable, they must also share a common fictitious variable, which then turns out to be fictitious for $f$.

So, suppose that $x_i$ is a fictitious variable and $a_j \leq a_i$. Without loss of generality, assume that $i = n - 1$ and $j = n$. Then for all $x_1, \dots, x_{n-2} \in \{0, 1\}$ the following inequalities hold true:

$G(x_1, \dots, x_{n-2}, 0, 0) \leq G(x_1, \dots, x_{n-2}, 0, 1) \leq G(x_1, \dots, x_{n-2}, 1, 0)$.

Since $x_{n-1}$ is fictitious, the leftmost and the rightmost expressions above are either both negative or both non-negative, and, obviously, so is the expression in the center. The same reasoning also holds true for the inequalities

$G(x_1, \dots, x_{n-2}, 0, 1) \leq G(x_1, \dots, x_{n-2}, 1, 0) \leq G(x_1, \dots, x_{n-2}, 1, 1)$.

This means that $g(x_1, \dots, x_{n-2}, x_{n-1}, 0)$ is always equal to $g(x_1, \dots, x_{n-2}, x_{n-1}, 1)$, regardless of $x_{n-1} \in \{0, 1\}$. So, $x_n$ is fictitious for $g$, which gives the desired result. □

We know now that the infinite basis of arbitrary monotone threshold functions contains no
discriminatory functions, and so the hypercube conjecture holds true for an arbitrary finite subbasis.
Hence, since $\binom{n}{\lfloor n/2 \rfloor} \sim 2^n / \sqrt{\pi n / 2}$, we obtain the following border between polynomial
and exponential complexity of exact identification:

Theorem 9. The problem of exact identification of read-once functions over the basis of 
arbitrary monotone threshold functions requires an exponential number of membership and 
subcube identity queries (in terms of the number of variables), but the same problem for an 
arbitrary finite subbasis can be solved with a polynomial number of queries. 
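
To get a feeling for how quickly the lower bound grows, one can compare the central binomial coefficient with the estimate $2^n / \sqrt{\pi n / 2}$ numerically. The sketch below assumes that the bound of Corollary 7 is $\binom{n}{\lfloor n/2 \rfloor}$, which is what the asymptotic estimate suggests; the function names are hypothetical.

```python
from math import comb, pi, sqrt

def lower_bound(n):
    """Central binomial coefficient C(n, floor(n/2))."""
    return comb(n, n // 2)

def asymptotic_estimate(n):
    """Estimate 2^n / sqrt(pi * n / 2) for the central binomial coefficient."""
    return 2 ** n / sqrt(pi * n / 2)

# Print n, the exact value, and the ratio of the exact value to the estimate.
for n in (10, 20, 40):
    exact = lower_bound(n)
    print(n, exact, round(exact / asymptotic_estimate(n), 4))
```

The ratio approaches 1 as $n$ grows, while the exact value itself grows exponentially.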

5 Open problems

We conclude this paper by formulating three open problems concerning subcube identity 
queries: 

1. Theorem 1 reveals that in some cases subcube identity queries can prove more useful
than equivalence ones. Does the same property hold true when equivalence queries are 
"supported" by membership ones? In what circumstances can subcube identity queries 
be modeled using equivalence and membership ones with a polynomial overhead only? 

2. Theorem 3 establishes an $O(n^{l+2})$ upper bound on the number of queries needed for
exact identification of read-once functions over bases of fan-in $l$ and less, for $l \le 5$. Is
this bound tight in terms of $O(\cdot)$ or does there exist a better algorithm than the one
from [5] where equivalence queries are modeled with membership and subcube identity
ones?

3. To what degree may the polynomial vs. exponential border of Theorem 9 be refined? In
other words, what is the complexity of exact identification of read-once functions over 
infinite bases of monotone threshold functions? One may be interested, for instance, in a 
characterization of infinite bases of monotone threshold functions which allow learning 
read-once functions with a polynomial number of membership and subcube identity 
queries. 

Appendix

We shall prove the hypercube conjecture for $l = 2$. Denote by $B_2$ the basis of all functions of fan-in
2 or less. It is trivial to check that any read-once function over $B_2$ can be represented by a
read-once formula over the basis $\mathcal{B}_2 = \{\wedge, \vee, \oplus, \odot, \neg, 1, 0\}$ (here $\oplus$ is a XOR of 2 arguments
and $\odot$ is its negation). We shall represent formulae as rooted trees with labeled vertices. We
shall place the leaves of such a tree at the bottom, and the root on the top. Any read-once formula
over $\mathcal{B}_2$ can be transformed so that its tree satisfies the following conditions:

1) any vertex labeled with "1" or "0" must be the only vertex in a tree; 

2) all leaves are labeled with different variables or their negations (literals);

3) all other vertices are labeled with linear ($\oplus$, $\odot$) or non-linear ($\wedge$, $\vee$) symbols representing
corresponding functions of fan-in 2 or greater;

4) adjacent vertices cannot be labeled with identical symbols or with different linear symbols;

5) any vertex $u$ lying directly below (adjacent to) a vertex $v$ labeled with a linear symbol
cannot be labeled with $\wedge$ or a negation of a variable.

Any rooted tree satisfying the five conditions above is called a canonical tree. Any such tree represents
a read-once Boolean function over $B_2$. Conversely, any such function can be represented
by a canonical tree. The uniqueness of such a tree will be proved later.

Let $X = \{x_1, \ldots, x_n\}$ and suppose that $f$ is a read-once Boolean function of variables
$X$ over $\mathcal{B}_2$. An essentiality square for variables $x_i, x_j \in X$ ($i \ne j$) is a set of four vectors
differing only in the $i$th and $j$th components such that $f$ restricted to the set of these vectors
depends essentially on both $x_i$ and $x_j$. In other words, these four vectors constitute the set
of all total extensions of a projection $p$ such that $p^{-1}(*) = \{x_i, x_j\}$ and $f_p$ does not have any
fictitious variables. An essentiality square set for $f$ is any set of Boolean vectors of length $n$
containing an essentiality square for every pair $\{x_i, x_j\} \subseteq X$. One can easily see that for all
such pairs an essentiality square exists.
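
The definition can be made concrete with a brute-force search. The following Python sketch is an illustration only: it assumes that the function is given as a callable on 0/1 tuples with 0-based positions, and the names `find_essentiality_square`, `essentiality_square_set` and `example_f` are hypothetical.

```python
from itertools import product

def find_essentiality_square(f, n, i, j):
    """Return four vectors differing only in positions i and j on which f
    depends essentially on both x_i and x_j, or None if no such square exists."""
    others = [k for k in range(n) if k not in (i, j)]
    for rest in product((0, 1), repeat=len(others)):
        base = [0] * n
        for k, v in zip(others, rest):
            base[k] = v
        square = []
        for a, b in product((0, 1), repeat=2):   # (x_i, x_j) = 00, 01, 10, 11
            vec = list(base)
            vec[i], vec[j] = a, b
            square.append(tuple(vec))
        v00, v01, v10, v11 = (f(v) for v in square)
        depends_on_i = v00 != v10 or v01 != v11
        depends_on_j = v00 != v01 or v10 != v11
        if depends_on_i and depends_on_j:
            return square
    return None

def essentiality_square_set(f, n):
    """Union of one essentiality square per pair of variables."""
    vectors = set()
    for i in range(n):
        for j in range(i + 1, n):
            square = find_essentiality_square(f, n, i, j)
            if square is None:
                raise ValueError(f"no essentiality square for positions {i}, {j}")
            vectors.update(square)
    return vectors

# Hypothetical example: f = (x1 AND x2) XOR x3, which is read-once over B2.
def example_f(x):
    return (x[0] & x[1]) ^ x[2]
```

The search is exponential in $n$ and is meant only to illustrate the definition; a set built this way contains at most four vectors per pair of variables.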

A glueing of a canonical tree is a rooted tree obtained from a canonical tree by performing 
the following operations: 

1. Replacing all linear symbols with 0 and all non-linear symbols with 1.

2. Contracting adjacent vertices labeled with 1.

3. Replacing literals of the form $\bar{x}_i$ with the corresponding variables $x_i$.
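
The three operations can be sketched in a few lines of Python. The encoding of trees as nested tuples, the symbol names and the example tree are assumptions made for this illustration and are not taken from the paper.

```python
LINEAR = {"xor", "xnor"}
NONLINEAR = {"and", "or"}

def glue(tree):
    """Glueing of a canonical tree given as nested tuples.

    Leaves are ("x", i, sigma) for the literal x_i (sigma = 1) or its negation
    (sigma = 0); internal vertices are (symbol, children) with symbol taken
    from LINEAR or NONLINEAR.
    """
    if tree[0] == "x":
        _, i, _sigma = tree
        return ("x", i)                      # operation 3: drop negations
    symbol, children = tree
    colour = 0 if symbol in LINEAR else 1    # operation 1: recolour
    glued_children = []
    for child in children:
        g = glue(child)
        if colour == 1 and g[0] == 1:        # operation 2: contract adjacent 1s
            glued_children.extend(g[1])
        else:
            glued_children.append(g)
    return (colour, glued_children)

# Hypothetical canonical tree for (x1 OR (NOT x2 AND x3)) XOR x4.
example_tree = ("xor", [("or", [("x", 1, 1),
                                ("and", [("x", 2, 0), ("x", 3, 1)])]),
                        ("x", 4, 1)])
```

For the example tree, the two adjacent non-linear vertices collapse into a single vertex labeled with 1 whose children are the leaves for $x_1$, $x_2$ and $x_3$.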

We also need the following concepts from graph theory. A graph on vertices $X$ is a
cograph iff it is reducible to an empty graph on $X$ by repeatedly complementing its connected
components. Suppose that $T$ is a rooted tree with leaves $X$ and no vertices with exactly one
child. Also suppose that non-leaf vertices of $T$ are properly coloured with 0 and 1 (no two
adjacent vertices have the same colour). Any tree satisfying these conditions is called a cotree.
Denote by $\phi(T)$ the graph on vertices $X$ such that $\{x_i, x_j\}$ is an edge in $\phi(T)$ iff the lowest
common ancestor of $x_i$ and $x_j$ is coloured with 1 in $T$.

Claim 10 ([10]). The mapping $\phi$ is a bijection between the set of all cotrees with leaves $X$
and the set of all cographs on vertices $X$.
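
One direction of this mapping, from a cotree to its graph, is easy to compute. The sketch below is a hedged illustration: the encoding of a cotree as nested pairs and the function name `cograph_edges` are assumptions of this example.

```python
from itertools import combinations

def cograph_edges(cotree):
    """Edges of the graph of a cotree: {a, b} is an edge iff the lowest common
    ancestor of leaves a and b is coloured 1.

    A cotree is either a leaf name (a string) or a pair (colour, children).
    """
    def walk(node):
        if isinstance(node, str):
            return [node], set()
        colour, children = node
        leaf_groups, edges = [], set()
        for child in children:
            leaves, child_edges = walk(child)
            leaf_groups.append(leaves)
            edges |= child_edges
        if colour == 1:
            # Leaves from different children have this vertex as their
            # lowest common ancestor.
            for group_a, group_b in combinations(leaf_groups, 2):
                edges |= {frozenset((a, b)) for a in group_a for b in group_b}
        return [leaf for group in leaf_groups for leaf in group], edges
    return walk(cotree)[1]

# Hypothetical cotree: a root coloured 0 with a 1-coloured child over x1, x2, x3.
example_cotree = (0, [(1, ["x1", "x2", "x3"]), "x4"])
```

In the example, the three leaves under the vertex coloured 1 form a triangle, and `x4` is isolated because its lowest common ancestor with every other leaf is the root coloured 0.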



We shall use the following notation:

$$x_i^{\sigma} = \begin{cases} x_i, & \text{if } \sigma = 1, \\ \bar{x}_i, & \text{if } \sigma = 0. \end{cases}$$

Lemma 11 (glueing lemma). A glueing $\tilde{T}$ of an arbitrary canonical tree $T$ for a read-once
function $f$ over $B_2$ is uniquely determined by the values of $f$ on the vectors of any essentiality
square set for $f$.

Proof. For an arbitrary canonical tree $T$, its glueing $\tilde{T}$ is unique. Let $T_1$ be a canonical tree for
$f$ and $\tilde{T}_1$ its glueing. Note that for all $\sigma_i, \sigma_j, \sigma \in \{0, 1\}$ the linearity of a function $(x_i^{\sigma_i} \circ x_j^{\sigma_j})^{\sigma}$,
where $\circ \in \{\wedge, \vee, \oplus, \odot\}$, coincides with the linearity of the function $x_i \circ x_j$ (in other words, with
the linearity of the symbol $\circ$). This means that an edge $\{x_i, x_j\}$ belongs to the set of edges of
the graph $\phi(\tilde{T}_1)$ iff all essentiality squares for variables $x_i, x_j$ have non-linear projections of
$f$. Hence, $\tilde{T} = \tilde{T}_1$, the glueings of all canonical trees for $f$ are identical, and the values of $f$
on an essentiality square set uniquely determine $\phi(\tilde{T})$ and, by Claim 10, $\tilde{T}$. □
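
The linearity test used in this proof is a simple parity check on the four values of a projection. The following Python sketch illustrates it; the data layout (a dictionary from pairs of 0-based positions to squares) and all names are assumptions of this example.

```python
def is_linear_projection(v00, v01, v10, v11):
    """A two-variable function depending on both arguments is linear
    (XOR or its negation) iff it takes the value 1 an even number of times."""
    return (v00 + v01 + v10 + v11) % 2 == 0

def glueing_graph_edges(f, squares):
    """squares maps a pair (i, j) to an essentiality square, listed as the four
    vectors with (x_i, x_j) = (0,0), (0,1), (1,0), (1,1).
    Returns the pairs whose projection is non-linear."""
    edges = set()
    for pair, square in squares.items():
        if not is_linear_projection(*(f(v) for v in square)):
            edges.add(pair)
    return edges

# Hypothetical example: f = (x1 AND x2) XOR x3, with 0-based positions.
def example_f(x):
    return (x[0] & x[1]) ^ x[2]

example_squares = {
    (0, 1): [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 0)],
    (0, 2): [(0, 1, 0), (0, 1, 1), (1, 1, 0), (1, 1, 1)],
    (1, 2): [(1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)],
}
```

Here only the pair of the first two variables yields a non-linear projection, so the resulting graph has a single edge; the parity test is valid only for projections that depend essentially on both variables, which is exactly what an essentiality square guarantees.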

A rooted subtree T' of a canonical tree T is called a fragment of a canonical tree T iff it 
satisfies the following conditions: 

1) T' has at least one non-leaf vertex; 

2) either the root of T' is the root of T, or the vertex adjacent to the root of T' and lying 
above it is linear; 

3) all vertices of T' lie in T below the root of T'; 

4) all linear vertices from T that are also in T' are leaves in T'; 

5) all non-linear vertices from T that are also in T' are not leaves in T'; all their children 
are in T'. 

Lemma 12 (fragment lemma). Suppose that $f$ is a read-once function over $B_2$ and one knows
a glueing $\tilde{T}$ of a canonical tree $T$. Also suppose that all children of a vertex $v$ of $\tilde{T}$, which is
labeled with 1 and corresponds to a fragment $T'$, are leaves in $\tilde{T}$. Then one can unambiguously
reconstruct $T'$ using the values of $f$ on the vectors from an essentiality square set for $f$.

Proof. The reconstruction of $T'$ can be performed in two steps. First, we shall reconstruct
two variants of the leaves' labels. Consider the leaves labeled with literals $x_i^{\sigma_i}$ and $x_j^{\sigma_j}$ ($\sigma_i$ and $\sigma_j$
are unknown). Since Boolean conjunction and disjunction are both monotone, all projections
of $f$ onto any essentiality square for $x_i$ and $x_j$ have the form $(x_i^{\sigma_i} \circ x_j^{\sigma_j})^{\sigma}$, where $\circ \in \{\wedge, \vee\}$
and $\sigma \in \{0, 1\}$. Hence, if such a projection is monotone or antimonotone in both its variables,
then $\sigma_i = \sigma_j$, otherwise $\sigma_i \ne \sigma_j$. This means that the values of $f$ on the vectors from an
essentiality square set determine two possible vectors of $\sigma$'s for the leaves of $T'$, which differ in
every single component.

Take any of these vectors and assume that it is the correct one. Now we can reconstruct
the whole unknown fragment. Consider two leaves of $T'$ labeled with $x_i^{\sigma_i}$ and $x_j^{\sigma_j}$, respectively.
Determine the label $\circ \in \{\wedge, \vee\}$ of the lowest common ancestor of these leaves in $T$. We shall
use the values of $f$ on the corresponding essentiality square. The associated projection is a
conjunction or a disjunction of $x_i^{\sigma_i}$ and $x_j^{\sigma_j}$ (or its negation), so there exists a Boolean
vector $\delta = (\delta_1, \delta_2)$ such that the values of this projection on all vectors $\gamma \ne \delta$ differ from its value
on $\delta$. If the lowest common ancestor of the considered leaves is labeled with $\wedge$, it follows
that $\delta = (\sigma_i, \sigma_j)$. Otherwise, if the lowest common ancestor is labeled with $\vee$, it follows that
$\delta = (\bar{\sigma}_i, \bar{\sigma}_j)$. This means that the unknown fragment can be reconstructed with the technique
of Claim 10.

Note that the inverse vector of $\sigma$'s corresponds to the same fragment tree with dual
labels (the symbols $\wedge$ and $\vee$ are said to be dual to each other). By De Morgan's laws, the functions
represented by these trees are each other's negations. If the root of $T'$ is also the root of $T$, then
the right tree can be chosen using the value of $f$ on any input. If this is not the case, the root
of $T'$, according to clause 5 of the definition of a canonical tree, cannot be labeled with
$\wedge$, which eliminates one of the variants. □
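
The two steps of this proof can be illustrated on a single two-variable projection. The Python sketch below is an assumption-laden illustration: the projection is given as a dictionary from input pairs to values, and the helper names are hypothetical.

```python
from itertools import product

def odd_vector(values):
    """values maps (a, b) in {0,1}^2 to the value of the projection. For a
    conjunction or disjunction of two literals (possibly negated as a whole),
    exactly one input gets a value different from the other three; return it."""
    ones = [v for v in product((0, 1), repeat=2) if values[v] == 1]
    zeros = [v for v in product((0, 1), repeat=2) if values[v] == 0]
    if len(ones) == 1:
        return ones[0]
    if len(zeros) == 1:
        return zeros[0]
    raise ValueError("projection is not of AND/OR type")

def same_exponents(values):
    """The two exponents are equal iff the odd vector has equal components,
    i.e. iff the projection is monotone or antimonotone in both variables."""
    d = odd_vector(values)
    return d[0] == d[1]

def ancestor_label(values, sigma_i, sigma_j):
    """Label of the lowest common ancestor under the assumed exponents."""
    d = odd_vector(values)
    if d == (sigma_i, sigma_j):
        return "and"
    if d == (1 - sigma_i, 1 - sigma_j):
        return "or"
    raise ValueError("assumed exponents are inconsistent with the projection")

# Hypothetical projection: x_i OR (NOT x_j).
example_values = {(0, 0): 1, (0, 1): 0, (1, 0): 1, (1, 1): 1}
```

For the example projection, assuming exponents (1, 0) yields the label "or", while the inverse assumption (0, 1) yields the dual label "and", matching the remark about dual labels above.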

Theorem 13. Let $f$ be a read-once function over $B_2$ and $M_f$ an essentiality square set for
$f$. Suppose that one knows the values of $f$ on all vectors from $M_f$. Then one can reconstruct
a unique canonical tree for $f$.

Proof. First, one can reconstruct a unique glueing $\tilde{T}$ of a canonical tree $T$ for $f$, using
the glueing lemma. Then for each vertex in $\tilde{T}$ labeled with 1 and having no descendants except
for leaves, one can reconstruct an associated fragment of $T$, using the fragment lemma. Suppose
that $\tilde{T}$ contains a vertex $v$ labeled with 0 such that all its descendants are leaves (labeled
with $x_{i_1}, \ldots, x_{i_p}$) and vertices labeled with 1 which have already been considered (with
corresponding subtrees representing functions $f_{j_1}, \ldots, f_{j_q}$). Also suppose that $v$ has not been
considered yet. Perform a substitution $x_t = x_{i_1} \oplus \ldots \oplus x_{i_p} \oplus f_{j_1} \oplus \ldots \oplus f_{j_q}$, where $t$ is a new
natural number, unique for each $v$. Such a substitution transforms an essentiality square set
for $f$ into an essentiality square set for a new function obtained from $f$. After that, one can
continue the reconstruction of a canonical tree for $f$. If the following steps prove that the leaf
corresponding to $x_t$ should be labeled with $x_t$, then the associated vertex in $T$ is labeled with
$\oplus$, otherwise it is labeled with $\odot$. If $v$ is the root vertex of $\tilde{T}$, then the label of the root vertex of
$T$ is determined by the value of $f$ on any single input. □

Corollary 14. Every function $f$ which is read-once over $B_2$ has a unique canonical tree.

Corollary 15. Suppose $f$ is a read-once function over $B_2$ and $M_f$ is an essentiality square
set for $f$. Then $T_f = \{(x, f(x)) : x \in M_f\}$ is a checking test for $f$ in the basis $B_2$, and its
cardinality $|T_f|$ is less than or equal to $4\binom{n}{2} = O(n^2)$.